Surrogate key in PySpark